5  Hypothesis testing

Author

Vladimir Buskin

5.1 Preparation

  • Load packages:
library("readxl")
library("tidyverse")
  • Load the data sets:
data <- read_xlsx("Paquot_Larsson_2020_data.xlsx")

data_vowels <- read.csv("Vowels_Apache.csv", sep = "\t")

5.2 Hypothesis testing

The first step is to define the null hypothesis \(H_0\) and the alternative hypothesis \(H_1\) (or \(H_a\)).

Given two categorical variables \(X\) and \(Y\), we assume under \(H_0\) that both variables are independent of each other. This hypothesis describes the “default state of the world” (James et al. 2021: 555), i.e., what we would usually expect to see. By contrast, the alternative hypothesis \(H_1\) states that \(X\) and \(Y\) are not independent, i.e., that \(H_0\) does not hold.

In this unit, we will consider two scenarios:

  1. We are interested in finding out whether English clause ORDER (‘sc-mc’ or ‘mc-sc’) depends on the type of the subordinate clause (SUBORDTYPE), which can be either temporal (‘temp’) or causal (‘caus’).

Our hypotheses are:

  • \(H_0:\) The variables ORDER and SUBORDTYPE are independent.

  • \(H_1:\) The variables ORDER and SUBORDTYPE are not independent.

  2. As part of a phonetic study, we compare the base frequencies of the F1 formants for male and female speakers of Apache. We put forward the following hypotheses:
  • \(H_0:\) mean F1 frequency of men \(=\) mean F1 frequency of women.

  • \(H_1:\) mean F1 frequency of men \(\ne\) mean F1 frequency of women.
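Both scenarios can be sketched in R with `chisq.test()` and `t.test()`. Since the relevant column names of the two data sets are not shown here, the sketch below uses invented toy data purely for illustration; the counts, means, and standard deviations are not taken from the actual corpora.

```r
# Scenario 1 (hedged sketch): chi-squared test of independence on a
# toy contingency table of SUBORDTYPE by ORDER. All counts are invented.
tab <- matrix(c(184,  15,
                 91, 101),
              nrow = 2, byrow = TRUE,
              dimnames = list(SUBORDTYPE = c("caus", "temp"),
                              ORDER      = c("mc-sc", "sc-mc")))
chisq.test(tab)

# Scenario 2 (hedged sketch): Welch two-sample t-test on simulated
# F1 values. Means and SDs are invented for illustration.
set.seed(1)
f1_men   <- rnorm(50, mean = 450, sd = 60)
f1_women <- rnorm(50, mean = 520, sd = 60)
t.test(f1_men, f1_women)
```

With the real data, one would instead cross-tabulate the two categorical variables (e.g., `chisq.test(table(data$SUBORDTYPE, data$ORDER))`) and compare F1 across speaker sex (e.g., `t.test(F1 ~ Sex, data = data_vowels)`), assuming those column names exist in the files loaded above.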

Based on our data, we then either reject \(H_0\) or fail to reject it. Rejecting \(H_0\) can be viewed as evidence in favour of \(H_1\) and thus marks a potential ‘discovery’ in the data. However, there is always a chance that our decision is wrong; the four possible outcomes are summarised in the table below (cf. Heumann, Schomaker, and Shalabh 2022: 223):

|                          | \(H_0\) is true                                       | \(H_0\) is not true                                   |
|--------------------------|-------------------------------------------------------|-------------------------------------------------------|
| \(H_0\) is not rejected  | \(\color{green}{\text{Correct decision}}\)             | \(\color{red}{\text{Type II } (\beta)\text{-error}}\) |
| \(H_0\) is rejected      | \(\color{red}{\text{Type I } (\alpha)\text{-error}}\)  | \(\color{green}{\text{Correct decision}}\)            |

The probability of a Type I error, which refers to the rejection of \(H_0\) although it is true, is called the significance level \(\alpha\), which has a conventional value of \(0.05\) (i.e., a 5% chance of committing a Type I error).

5.3 Constructing the critical region

An important question remains: How large must the difference be for us to reject \(H_0\)? The \(p\)-value measures the probability of observing a test statistic at least as extreme as the one computed from our data, under the assumption that \(H_0\) holds. For example, a \(p\)-value of \(0.02\) means that we would see a \(\chi^2\)-score (or \(T\), \(F\) etc.) at least this extreme only 2% of the time if \(X\) and \(Y\) were unrelated (or if there were no difference between \(\bar{x}\) and \(\bar{y}\), respectively). Since our significance level \(\alpha\) is set to \(0.05\), we reject the null hypothesis only if this probability falls below 5%.
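In R, such tail probabilities can be read off directly with the cumulative distribution functions, without consulting printed tables. The sketch below uses the conventional critical \(\chi^2\)-score of \(3.84\) for \(df = 1\), and the \(t\)-statistic \(t = 2.4416\) with \(df = 112.19\) that appears in the plotting code in this section.

```r
# Upper-tail p-value for a chi-squared score of 3.84 with df = 1;
# 3.84 is (approximately) the critical value at alpha = 0.05
pchisq(3.84, df = 1, lower.tail = FALSE)   # ≈ 0.05

# Two-sided p-value for t = 2.4416 with df = 112.19:
# twice the upper-tail probability, by symmetry of the t-distribution
2 * pt(2.4416, df = 112.19, lower.tail = FALSE)
```

Since the second value falls below \(0.05\), a test yielding \(t = 2.4416\) with these degrees of freedom would lead us to reject \(H_0\) at the conventional significance level.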

We obtain \(p\)-values by consulting the probability density functions of the underlying distributions:

  • Probability density function for the \(\chi^2\)-distribution with \(df = 1\)
# Generate random samples from a chi-squared distribution with 1 degree of freedom
x <- rchisq(100000, df = 1)

# Create histogram
hist(x,
     breaks = "Scott",
     freq = FALSE,
     xlim = c(0, 20),
     ylim = c(0, 0.2),
     ylab = "Probability density of observing a specific score",
     xlab = "Chi-squared score",
     main = "Histogram for a chi-squared distribution with 1 degree of freedom (df)",
     cex.main = 0.9)

# Overlay PDF
curve(dchisq(x, df = 1), from = 0, to = 20, n = 5000, col = "orange", lwd = 2, add = TRUE)

  • Probability density function for the \(t\)-distribution with \(df = 112.19\)
# Given t-statistic and degrees of freedom
t_statistic <- 2.4416
df <- 112.19

# Generate random samples from a t-distribution with the given degrees of freedom
x <- rt(100000, df = df)

# Create histogram
hist(x,
     breaks = "Scott",
     freq = FALSE,
     xlim = c(-5, 5),
     ylim = c(0, 0.4),
     ylab = "Probability density of observing a specific score",
     xlab = "t-score",
     main = "Histogram for a t-distribution with 112.19 degrees of freedom",
     cex.main = 0.9)

# Overlay PDF
curve(dt(x, df = df), from = -5, to = 5, n = 5000, col = "orange", lwd = 2, add = TRUE)